Support RAID configuration for baremetal server#292
Conversation
043f835 to
48ef146
Compare
|
Can one of the admins verify this patch? |
dhellmann
left a comment
There was a problem hiding this comment.
This version is looking pretty good.
I think we can improve the validation by using kubebuilder features. I have tried to give an example in an inline comment.
I would like for us to add the smallest possible API for managing RAID. It is easier to add to the API than to remove from it once it is being used. If there are reasonable default behaviors, we should leave options out, for now. If there are advanced features, we may never want to include them.
I think the code to start the manual cleaning process needs to be run in more places. I have tried to explain in more detail inline.
|
|
||
| // Integer, number of disks to use for the logical disk. Defaults to minimum number of disks required | ||
| // for the particular RAID level. | ||
| NumberOfPhysicalDisks int `json:"numberOfPhysicalDisks,omitempty"` |
There was a problem hiding this comment.
When would you ever set this to anything other than the default for a given RAID mode?
There was a problem hiding this comment.
I never try to set this before.
There was a problem hiding this comment.
Well, you can have non-default number of disks, especially for RAID-0.
There was a problem hiding this comment.
How does the validation instruction work with this field? If the user doesn't specify a value, will we get 0? But if they do specify a value then the validation ensures that it is > 0?
There was a problem hiding this comment.
Opps, I missed that point.
In the case of user didn't specify a value, we will pass the validation of kubebuilder and got 0.
And if they specify a value, it will be validate by kubebuilder in line 129 // +kubebuilder:validation:Minimum=1
So I think pointer would be suitable in the case of missing this field
Thanks
There was a problem hiding this comment.
Unresolved this because it appears we decided on a pointer here but the field didn't actually change.
Also for software RAID we're supposed to be able to supply device hints in the same format as the root device hints we added in #495, and the DiskType and NumberOfPhysicalDisks do not apply. If we're going to keep the software RAID thing I wonder if we should separate the software and hardware RAID configurations into separate structures.
There was a problem hiding this comment.
How about this:
// RAIDConfig contains the configuration that are required to config RAID in Bare Metal server
type RAIDConfig struct {
// If true will remove existing raid configuration and set new raid configuration.
DeleteExisting bool `json:"deleteExisting,omitempty"`
// The list of logical disks for hardware RAID, first volume is root volume
HVolumes []HRAIDVolume `json:"hVolumes,omitempty"`
// The list of logical disks for software RAID, first volume is root volume, if HVolume is setted this items will be invalid.
// The number of created Software RAID devices must be 1 or 2.
// If there is only one Software RAID device, it has to be a RAID-1.
// If there are two, the first one has to be a RAID-1, while the RAID level for the second one can 0, 1, or 1+0.
// As the first RAID device will be the deployment device,
// enforcing a RAID-1 reduces the risk of ending up with a non-booting node in case of a disk failure.
SVolumes []SRAIDVolume `json:"sVolumes,omitempty"`
}
48ef146 to
e41d52a
Compare
|
@dhellmann , @dtantsur : Thank for your comments |
3ab9a5e to
d7a3d5a
Compare
d7a3d5a to
6f19d1c
Compare
|
|
||
| // Integer, number of disks to use for the logical disk. Defaults to minimum number of disks required | ||
| // for the particular RAID level. | ||
| NumberOfPhysicalDisks int `json:"numberOfPhysicalDisks,omitempty"` |
6f19d1c to
abd2434
Compare
longkb
left a comment
There was a problem hiding this comment.
@dhellmann Thanks for you reviewing. I have pushed some revision in this PR :)
abd2434 to
2368edb
Compare
2368edb to
922ba8a
Compare
922ba8a to
78b3314
Compare
78b3314 to
f685941
Compare
f685941 to
d592398
Compare
|
|
||
| // The valid values must be "hdd" or "ssd". If this is not specified, disk type will not be a criterion to find backing physical disks | ||
| // +kubebuilder:validation:Enum=hdd,ssd | ||
| DiskType nodes.DiskType `json:"diskType,omitempty"` |
There was a problem hiding this comment.
If this field is optional, don't we need for it to be a pointer? Or does DiskType have a good zero value?
There was a problem hiding this comment.
Although Ironic will handle the zero value for DiskType, but I think it would be better if we make it to be a pointer
Thanks
There was a problem hiding this comment.
This still needs to be addressed. What is the zero value for nodes.DiskType? What default do we want for this field?
There was a problem hiding this comment.
I have no idea, but in OpenStack Docs, a part of example haven't set disk_type, and i find some explanation.
disk_type - hdd or ssd. If this is not specified, disk type will not be a criterion to find backing physical disks.
There was a problem hiding this comment.
Or we can check DiskType, if DiskType is empty return error.
There was a problem hiding this comment.
It's just a string so the zero value is an empty string 🤷♂️
There was a problem hiding this comment.
We probably do have to specifically allow an empty value in the kubebuilder validation of the previous line though I think?
Everywhere else in the API (HardwareDetails and root device hints) we have had a bool named Rotational instead of the strings "hdd" and "sdd". Let's do that here also and make this a bool pointer.
|
|
||
| // Integer, number of disks to use for the logical disk. Defaults to minimum number of disks required | ||
| // for the particular RAID level. | ||
| NumberOfPhysicalDisks int `json:"numberOfPhysicalDisks,omitempty"` |
There was a problem hiding this comment.
How does the validation instruction work with this field? If the user doesn't specify a value, will we get 0? But if they do specify a value then the validation ensures that it is > 0?
d592398 to
bd43382
Compare
dtantsur
left a comment
There was a problem hiding this comment.
I don't think I can lgtm here, but looks good
| } | ||
|
|
||
| // HardwareRAIDVolume defines the desired configuration of volume in hardware RAID | ||
| type HardwareRAIDVolume struct { |
There was a problem hiding this comment.
nit: we should probably move this to GopherCloud eventually
| SizeGibibytes *int `json:"sizeGibibytes,omitempty"` | ||
|
|
||
| // RAID level for the logical disk. The following levels are supported: 0;1;1+0. | ||
| // +kubebuilder:validation:Enum="0";"1";"1+0" |
There was a problem hiding this comment.
Yep (I'm fine with a follow-up)
| SizeGibibytes *int `json:"sizeGibibytes,omitempty"` | ||
|
|
||
| // RAID level for the logical disk. The following levels are supported: 0;1;1+0. | ||
| // +kubebuilder:validation:Enum="0";"1";"1+0" |
There was a problem hiding this comment.
Let's probably open another patch and discuss there.
| host.Status.Provisioning.RAID = &metal3v1alpha1.RAIDConfig{} | ||
| dirty = true | ||
| } | ||
| // If HardwareRAIDVolumes isn't nil, we will ignore SoftwareRAIDVolumes. |
There was a problem hiding this comment.
Let's research this option in a follow-up.
| } | ||
|
|
||
| // A private method to build RAID disks | ||
| func buildTargetRAIDCfg(raid *metal3v1alpha1.RAIDConfig) (logicalDisks []nodes.LogicalDisk, err error) { |
There was a problem hiding this comment.
nit: maybe make this public for reuse? e.g. we use these bits in openshift installer (if you care about it).
| } | ||
|
|
||
| // buildRAIDCleanSteps build the clean steps for RAID configuration from BaremetalHost spec | ||
| func buildRAIDCleanSteps(raid *metal3v1alpha1.RAIDConfig) (cleanSteps []nodes.CleanStep) { |
|
/test-integration |
|
/hold cancel |
Signed-off-by: zouyu <zouy.fnst@cn.fujitsu.com>
|
/test-integration |
|
@andfasano @dtantsur @zaneb PTAL. |
| if p.bmcAccess.RAIDInterface() != "" { | ||
| cleanSteps = append(cleanSteps, BuildRAIDCleanSteps(p.host.Status.Provisioning.RAID)...) | ||
| } else if p.host.Status.Provisioning.RAID != nil { | ||
| return nil, fmt.Errorf("RAID settings are exist, but the node's driver %s does not support RAID", p.bmcAccess.Driver()) |
There was a problem hiding this comment.
typo: I guess it's RAID settings are defined
| // Build manual clean steps | ||
| cleanSteps, err := p.buildManualCleaningSteps() | ||
| if err != nil { | ||
| result, err = transientError(err) |
There was a problem hiding this comment.
A transientError produces an immediate re-enqueue of the reconcile loop. IIUC since in this case the error generated by p.buildManualCleaningSteps() is synchronous (and could maybe require a human intervention to be fixed), I'd suggest to use operationFailed() since it will activate the retry mechanism with backoff
| var cleanSteps []nodes.CleanStep | ||
| cleanSteps, err = p.buildManualCleaningSteps() | ||
| if err != nil { | ||
| result, err = transientError(err) |
| }, | ||
| }, | ||
| } | ||
| } |
There was a problem hiding this comment.
That's really a fragile approach, and moreover hides the setup in the cases list. In other situation we've used the builder pattern to allow each case to specify its own settings (as for the testserver), here's not yet available for the host, but at least an enrichment function can be added in the test struct, to be invoked on the host only when defined
There was a problem hiding this comment.
Thanks, I added a flag existRaidConfig to test cases.
There was a problem hiding this comment.
A good step in the right direction (at least more robust than before), even though the single test case readability is still not optimal (a builder approach would be definitely better); not anymore a blocking point for me though
Signed-off-by: zouyu <zouy.fnst@cn.fujitsu.com>
Signed-off-by: zouyu <zouy.fnst@cn.fujitsu.com>
| }, | ||
| }, | ||
| } | ||
| } |
There was a problem hiding this comment.
A good step in the right direction (at least more robust than before), even though the single test case readability is still not optimal (a builder approach would be definitely better); not anymore a blocking point for me though
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: andfasano, dtantsur, longkb, zaneb The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
@dhellmann @zaneb @andfasano and @dtantsur have approved this PR, is there anything else block the PR from being merged? |
|
/test-integration |
|
@andfasano @dtantsur I think this needs an LGTM to proceed. |
|
/lgtm I don't think I actually have LGTM rights on this repository, but let's try. |
|
@dtantsur: adding LGTM is restricted to approvers and reviewers in OWNERS files. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
| maxLength: 64 | ||
| type: string | ||
| numberOfPhysicalDisks: | ||
| description: Integer, number of disks to use for the logical disk. Defaults to minimum number of disks required for the particular RAID level. |
There was a problem hiding this comment.
| description: Integer, number of disks to use for the logical disk. Defaults to minimum number of disks required for the particular RAID level. | |
| description: Integer, number of physical disks to use for the logical disk. Defaults to minimum number of disks required for the particular RAID level. |
| - 1+0 | ||
| type: string | ||
| physicalDisks: | ||
| description: A list of device hints, the number of item should be greater than or equal to 2. |
There was a problem hiding this comment.
| description: A list of device hints, the number of item should be greater than or equal to 2. | |
| description: A list of device hints, the number of items should be greater than or equal to 2. |
| - 1+0 | ||
| type: string | ||
| physicalDisks: | ||
| description: A list of device hints, the number of item should be greater than or equal to 2. |
There was a problem hiding this comment.
| description: A list of device hints, the number of item should be greater than or equal to 2. | |
| description: A list of device hints, the number of items should be greater than or equal to 2. |
| - 1+0 | ||
| type: string | ||
| physicalDisks: | ||
| description: A list of device hints, the number of item should be greater than or equal to 2. |
There was a problem hiding this comment.
| description: A list of device hints, the number of item should be greater than or equal to 2. | |
| description: A list of device hints, the number of items should be greater than or equal to 2. |
| - 1+0 | ||
| type: string | ||
| physicalDisks: | ||
| description: A list of device hints, the number of item should be greater than or equal to 2. |
There was a problem hiding this comment.
| description: A list of device hints, the number of item should be greater than or equal to 2. | |
| description: A list of device hints, the number of items should be greater than or equal to 2. |
| maxLength: 64 | ||
| type: string | ||
| numberOfPhysicalDisks: | ||
| description: Integer, number of disks to use for the logical disk. Defaults to minimum number of disks required for the particular RAID level. |
There was a problem hiding this comment.
| description: Integer, number of disks to use for the logical disk. Defaults to minimum number of disks required for the particular RAID level. | |
| description: Integer, number of physical disks to use for the logical disk. Defaults to minimum number of disks required for the particular RAID level. |
| maxLength: 64 | ||
| type: string | ||
| numberOfPhysicalDisks: | ||
| description: Integer, number of disks to use for the logical disk. Defaults to minimum number of disks required for the particular RAID level. |
There was a problem hiding this comment.
| description: Integer, number of disks to use for the logical disk. Defaults to minimum number of disks required for the particular RAID level. | |
| description: Integer, number of physical disks to use for the logical disk. Defaults to minimum number of disks required for the particular RAID level. |
| maxLength: 64 | ||
| type: string | ||
| numberOfPhysicalDisks: | ||
| description: Integer, number of disks to use for the logical disk. Defaults to minimum number of disks required for the particular RAID level. |
There was a problem hiding this comment.
| description: Integer, number of disks to use for the logical disk. Defaults to minimum number of disks required for the particular RAID level. | |
| description: Integer, number of physical disks to use for the logical disk. Defaults to minimum number of disks required for the particular RAID level. |
|
/lgtm @demonCoder95 @Hellcatlk there are still a couple of points pending, but they could be addressed in a followup:
|
|
@andfasano I submitted a PR to resolve the comments left by @rdoxenham. |
Support RAID configuration for baremetal server
Currently, Metal3 does not support deploy baremetal server with RAID configuration (#206).
This PR aims to extend Ironic Provisioner to support RAID configuration with the bellows:
raidproperty in host's spec.manual cleaningto configraidinprovisioningphase.